assessment system
Transforming Student Evaluation with Adaptive Intelligence and Performance Analytics
S, Pushpalatha K, Mangalur, Abhishek, Hegde, Ketan, Badachi, Chetan, Aamir, Mohammad
Advances in Artificial Intelligence (AI) offer transformative potential for redefining student assessment methodologies. This paper examines how AI can reshape approaches to assessing students. It presents a system that evaluates student performance using AI, specifically the Gemini API, for question generation, grading, and reporting on student performance. The system is intended to make it easy to create, schedule, and deliver assessments while reducing opportunities for cheating through options such as full-screen mode and time limits. It supports several question formats generated by Gemini: multiple-choice, short-answer, and descriptive questions. Its most prominent feature is automatic checking, which gives each student an instant score together with explanations of wrong answers. The platform also offers intelligent learning progressions, which let users monitor their performance and receive recommendations suited to their level. Students and educators alike get real-time analytics and feedback on strengths and areas for improvement. Beyond making assessment easier, the system improves grading accuracy and strengthens a data-driven learning process for students.
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment
Raina, Vyas, Liusie, Adian, Gales, Mark
Large Language Models (LLMs) are powerful zero-shot assessors used in real-world situations such as assessing written exams and benchmarking systems. Despite these critical applications, no existing work has analyzed the vulnerability of judge-LLMs to adversarial manipulation. This work presents the first study on the adversarial robustness of assessment LLMs, where we demonstrate that short universal adversarial phrases can be concatenated to deceive judge LLMs to predict inflated scores. Since adversaries may not know or have access to the judge-LLMs, we propose a simple surrogate attack where a surrogate model is first attacked, and the learned attack phrase then transferred to unknown judge-LLMs. We propose a practical algorithm to determine the short universal attack phrases and demonstrate that when transferred to unseen models, scores can be drastically inflated such that irrespective of the assessed text, maximum scores are predicted. It is found that judge-LLMs are significantly more susceptible to these adversarial attacks when used for absolute scoring, as opposed to comparative assessment. Our findings raise concerns on the reliability of LLM-as-a-judge methods, and emphasize the importance of addressing vulnerabilities in LLM assessment methods before deployment in high-stakes real-world scenarios.
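The surrogate attack described in the abstract above can be illustrated with a minimal sketch. It assumes a greedy word-by-word search that picks, at each step, the vocabulary word that most raises the total score over a set of texts. The scorer `toy_surrogate_score`, the vocabulary, and the texts are hypothetical stand-ins invented for illustration; a real surrogate would be an LLM prompted to output a score, and this is not presented as the authors' exact algorithm.

```python
from typing import Callable, List, Sequence


def toy_surrogate_score(text: str) -> float:
    """Hypothetical surrogate judge: sums fixed weights of known words.

    This keyword scorer is an assumption for illustration only.
    """
    weights = {"excellent": 2.0, "coherent": 1.5, "summary": 0.5}
    return sum(weights.get(token, 0.0) for token in text.lower().split())


def greedy_universal_phrase(
    texts: Sequence[str],
    vocab: Sequence[str],
    scorer: Callable[[str], float],
    length: int,
) -> List[str]:
    """Greedily build one phrase that raises the score of every text."""
    phrase: List[str] = []
    for _ in range(length):

        def total_score(word: str) -> float:
            # Score all texts with the current phrase plus the candidate word.
            candidate = " ".join(phrase + [word])
            return sum(scorer(f"{t} {candidate}") for t in texts)

        # Keep the word that maximises the summed surrogate score.
        phrase.append(max(vocab, key=total_score))
    return phrase


texts = ["a short summary", "an unrelated answer"]
vocab = ["the", "coherent", "excellent", "bad"]
phrase = greedy_universal_phrase(texts, vocab, toy_surrogate_score, length=2)

# The same phrase is appended to every text, regardless of its content.
attacked = [f"{t} {' '.join(phrase)}" for t in texts]
```

The phrase is "universal" in the sense that it is learned once over a set of texts and then appended unchanged to any new text; transferring it to an unseen judge simply means scoring `attacked` with a different scorer.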
A Global Transport Capacity Risk Prediction Method for Rail Transit Based on Gaussian Bayesian Network
Zhengyang, Zhang, Wei, Dong, Jun, Liu, Xinya, Sun, Yindong, Ji
Rail transit plays an increasingly important role in modern urban transportation with its advantages of large capacity, good punctuality, high safety, environmental friendliness and low cost, and has become the backbone and important support of modern transportation. Although the safety of rail transit is higher than that of conventional road traffic, due to the large scale of rail transit network, heavy transportation tasks and close coupling between lines, once a failure or safety accident occurs, it will have a great impact on urban transportation. For example, on December 22, 2009, around 7:00 a.m., a collision occurred on [...] Since transport capacity risks at the rail transit network level have a large influence surface and propagation inertia, different passenger flow conditions will also have different impacts on the safety of the network, if effective preventive measures are not taken, once the risk propagation starts, it can easily lead to a rapid decline in the safety of the whole network and eventually lead to safety accidents. Therefore, the prediction of transport capacity risk on the basis of transport capacity risk assessment has important practical significance for the safe operation of rail transit network.
Explainable Label-flipping Attacks on Human Emotion Assessment System
Zhang, Zhibo, Hammadi, Ahmed Y. Al, Damiani, Ernesto, Yeun, Chan Yeob
Abstract: This paper's main goal is to provide an attacker's [...] as current or past employees, and have inside knowledge of the business's security protocols, customer information, and computer systems. To assess human emotions and acts, speech [1] and facial expression [2] data were employed traditionally. On the other hand, EEG signals [3] have been utilized in recent years to assess a person's emotional state to prevent potential industrial insider assaults because people cannot conceal or manipulate their [...] The dataset was compiled to look at potential applications of brainwave signals for spotting insider threats in the workplace. The Emotiv Insight 5 channels were the tool used to collect the data. Information from 17 people who gave their [...] The four risk categories (High-Risk, Medium-Risk, Low-Risk, and Normal) found in the risk matrix were used to classify each signal for a captured image, and each signal was then given the appropriate label.
L2 proficiency assessment using self-supervised speech representations
Bannò, Stefano, Knill, Kate M., Matassoni, Marco, Raina, Vyas, Gales, Mark J. F.
There has been a growing demand for automated spoken language assessment systems in recent years. A standard pipeline for this process is to start with a speech recognition system and derive features, either hand-crafted or based on deep-learning, that exploit the transcription and audio. Though these approaches can yield high performance systems, they require speech recognition systems that can be used for L2 speakers, and preferably tuned to the specific form of test being deployed. Recently a self-supervised speech representation based scheme, requiring no speech recognition, was proposed. This work extends the initial analysis conducted on this approach to a large scale proficiency test, Linguaskill, that comprises multiple parts, each designed to assess different attributes of a candidate's speaking proficiency. The performance of the self-supervised, wav2vec 2.0, system is compared to a high performance hand-crafted assessment system and a BERT-based text system both of which use speech transcriptions. Though the wav2vec 2.0 based system is found to be sensitive to the nature of the response, it can be configured to yield comparable performance to systems requiring a speech transcription, and yields gains when appropriately combined with standard approaches.
What is automated essay scoring? - Assessment Systems
Automated essay scoring is an important application of machine learning and artificial intelligence to the field of psychometrics and assessment. In fact, it's been around far longer than "machine learning" and "artificial intelligence" have been buzzwords in the general public! The field of psychometrics has been doing such groundbreaking work for decades. So how does it work, and how can you apply it? The first and most critical thing to know is that there is not an algorithm that "reads" the student essays.
Improving Peer Assessment with Graph Convolutional Networks
Namanloo, Alireza A., Thorpe, Julie, Salehi-Abari, Amirali
Peer assessment systems are emerging in many social and multi-agent settings, such as peer grading in large (online) classes, peer review in conferences, peer art evaluation, etc. However, peer assessments might not be as accurate as expert evaluations, thus rendering these systems unreliable. The reliability of peer assessment systems is influenced by various factors such as assessment ability of peers, their strategic assessment behaviors, and the peer assessment setup (e.g., peer evaluating group work or individual work of others). In this work, we first model peer assessment as multi-relational weighted networks that can express a variety of peer assessment setups, plus capture conflicts of interest and strategic behaviors. Leveraging our peer assessment network model, we introduce a graph convolutional network which can learn assessment patterns and user behaviors to more accurately predict expert evaluations. Our extensive experiments on real and synthetic datasets demonstrate the efficacy of our proposed approach, which outperforms existing peer assessment methods.
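To make the graph-convolution idea in the abstract above concrete, here is a minimal sketch of a single propagation step over a weighted peer-assessment network. It uses the widely known normalised rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W); the adjacency weights, node features, and weight matrix are hypothetical, and the paper's multi-relational architecture is not reproduced here.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One graph-convolution step with self-loops and symmetric normalisation.

    adj[i][j] is an assumed non-negative edge weight between users i and j,
    for example how strongly their assessments are linked.
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation: divide each entry by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    # ReLU non-linearity.
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical example: three peers, one feature each (their average given grade).
adj = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.0], [0.5, 0.0, 0.0]]
feats = [[0.8], [0.6], [0.9]]
weight = [[1.0]]
hidden = gcn_layer(adj, feats, weight)
```

In a trained model the entries of `weight` would be learned so that the final layer's output approximates expert evaluations; here they are fixed only to show the data flow.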
Algorithm Based on One Monocular Video Delivers Highly Valid and Reliable Gait Parameters
Azhand, Arash, Rabe, Sophie, Müller, Swantje, Sattler, Igor, Steinert, Anika
Despite its paramount importance for manifold use cases (e.g., in the health care industry, sports, rehabilitation and fitness assessment), sufficiently valid and reliable gait parameter measurement is still limited to high-tech gait laboratories mostly. Here, we demonstrate the excellent validity and test-retest repeatability of a novel gait assessment system which is built upon modern convolutional neural networks to extract three-dimensional skeleton joints from monocular frontal-view videos of walking humans. The validity study is based on a comparison to the GAITRite pressure-sensitive walkway system. All measured gait parameters (gait speed, cadence, step length and step time) showed excellent concurrent validity for multiple walk trials at normal and fast gait speeds. The test-retest-repeatability is on the same level as the GAITRite system. In conclusion, we are convinced that our results can pave the way for cost, space and operationally effective gait analysis in broad mainstream applications. Most sensor-based systems are costly, must be operated by extensively trained personnel (e.g., motion capture systems) or - even if not quite as costly - still possess considerable complexity (e.g., wearable sensors). In contrast, a video sufficient for the assessment method presented here can be obtained by anyone, without much training, via a smartphone camera.
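The four gait parameters named in the abstract above follow from simple definitions once heel-strike events are known. The sketch below assumes that heel-strike times and positions along the walking direction have already been extracted (for example from the skeleton joints); the function name and input format are hypothetical, and this is not the authors' pipeline.

```python
from typing import Dict, Sequence


def gait_parameters(
    strike_times_s: Sequence[float],
    strike_positions_m: Sequence[float],
) -> Dict[str, float]:
    """Average gait parameters from consecutive heel strikes.

    Each entry is one heel strike (alternating feet), given as a time in
    seconds and a position in metres along the direction of walking.
    """
    if len(strike_times_s) != len(strike_positions_m) or len(strike_times_s) < 2:
        raise ValueError("need at least two heel strikes with matching lengths")

    n_steps = len(strike_times_s) - 1          # one step per pair of strikes
    duration_s = strike_times_s[-1] - strike_times_s[0]
    distance_m = strike_positions_m[-1] - strike_positions_m[0]

    return {
        "gait_speed_m_per_s": distance_m / duration_s,
        "cadence_steps_per_min": 60.0 * n_steps / duration_s,
        "step_length_m": distance_m / n_steps,
        "step_time_s": duration_s / n_steps,
    }


params = gait_parameters(
    strike_times_s=[0.0, 0.5, 1.0, 1.5, 2.0],
    strike_positions_m=[0.0, 0.6, 1.2, 1.8, 2.4],
)
```

For these made-up inputs, four steps cover 2.4 m in 2.0 s, which gives a step length of 0.6 m, a step time of 0.5 s, a cadence of 120 steps per minute, and a gait speed of 1.2 m/s.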
An Intelligent Passive Food Intake Assessment System with Egocentric Cameras
Lo, Frank Po Wen, Jobarteh, Modou L, Sun, Yingnan, Qiu, Jianing, Jiang, Shuo, Frost, Gary, Lo, Benny
Malnutrition is a major public health concern in low-and-middle-income countries (LMICs). Understanding food and nutrient intake across communities, households and individuals is critical to the development of health policies and interventions. To ease the procedure in conducting large-scale dietary assessments, we propose to implement an intelligent passive food intake assessment system via egocentric cameras, particularly for households in Ghana and Uganda. Algorithms are first designed to remove redundant images for minimising the storage memory. At run time, deep learning-based semantic segmentation is applied to recognise multi-food types and newly-designed handcrafted features are extracted for further consumed food weight monitoring. Comprehensive experiments are conducted to validate our methods on an in-the-wild dataset captured under the settings which simulate the unique LMIC conditions with participants of Ghanaian and Kenyan origin eating common Ghanaian/Kenyan dishes. To demonstrate the efficacy, experienced dietitians are involved in this research to perform the visual portion size estimation, and their predictions are compared to our proposed method. The promising results have shown that our method is able to reliably monitor food intake and give feedback on users' eating behaviour which provides guidance for dietitians in regular dietary assessment.
How is AI Changing the World of Assessments?
Until a few years ago, Artificial Intelligence existed only in the domain of science fiction and fantasy. Today, however, it has become part of everyday life, in social as well as business environments. From the military, automotive, agriculture, legal, and healthcare sectors to education, this technology has touched almost every field, affecting human lives to a great extent. AI systems can reduce human effort in numerous areas, and their applications help get work done faster and with accurate results.